We consider semi-supervised binary classification for applications in which data points are naturally grouped (e.g., survey responses grouped by state) and the labeled data is biased (e.g., survey respondents are not representative of the population). The groups overlap in the feature space and consequently the input-output patterns are related across the groups. To model the inherent structure in such data, we assume the partition-projected class-conditional invariance across groups, defined in terms of the group-agnostic feature space. We demonstrate that under this assumption, the group carries additional information about the class, over the group-agnostic features, with provably improved area under the ROC curve. Further assuming invariance of partition-projected class-conditional distributions across both labeled and unlabeled data, we derive a semi-supervised algorithm that explicitly leverages the structure to learn an optimal, group-aware, probability-calibrated classifier, despite the bias in the labeled data. Experiments on synthetic and real data demonstrate the efficacy of our algorithm over suitable baselines and ablative models, spanning standard supervised and semi-supervised learning approaches, with and without incorporating the group directly as a feature.
translated by 谷歌翻译
我们考虑了解释一类可拖动的深层概率模型,总和网络(SPN)的问题,并提出了一种算法EXSPN来生成解释。为此,我们定义了特定于上下文的独立树(CSI-Tree)的概念,并提出了将SPN转换为CSI-Tree的迭代算法。最终的CSi-Tree既可以解释又可以解释为域专家。我们通过提取由SPN编码的条件独立性并近似SPN指定的局部环境来实现这一目标。我们对合成,标准和现实世界中临床数据集的广泛经验评估表明,CSI-Tree具有较高的解释性。
translated by 谷歌翻译
In this technical note, we introduce an improved variant of nearest neighbors for counterfactual inference in panel data settings where multiple units are assigned multiple treatments over multiple time points, each sampled with constant probabilities. We call this estimator a doubly robust nearest neighbor estimator and provide a high probability non-asymptotic error bound for the mean parameter corresponding to each unit at each time. Our guarantee shows that the doubly robust estimator provides a (near-)quadratic improvement in the error compared to nearest neighbor estimators analyzed in prior work for these settings.
translated by 谷歌翻译
选项代表了在增强学习(RL)中跨多个时间尺度推理的框架。由于最近对RL研究社区无监督学习范式的积极兴趣,该期权框架的适应性适应了授权的概念,这与代理商对环境的影响量相对应,并感知这种影响力和这种影响力的能力,可以在环境奖励结构提供的任何监督的情况下进行优化。许多最近的论文以各种方式修改了这一概念,从而取得了值得称赞的结果。但是,通过这些各种修改,授权的初始背景通常会丢失。在这项工作中,我们通过原始授权原则的角度提供了对此类论文的比较研究。
translated by 谷歌翻译
We consider after-study statistical inference for sequentially designed experiments wherein multiple units are assigned treatments for multiple time points using treatment policies that adapt over time. Our goal is to provide inference guarantees for the counterfactual mean at the smallest possible scale -- mean outcome under different treatments for each unit and each time -- with minimal assumptions on the adaptive treatment policy. Without any structural assumptions on the counterfactual means, this challenging task is infeasible due to more unknowns than observed data points. To make progress, we introduce a latent factor model over the counterfactual means that serves as a non-parametric generalization of the non-linear mixed effects model and the bilinear latent factor model considered in prior works. For estimation, we use a non-parametric method, namely a variant of nearest neighbors, and establish a non-asymptotic high probability error bound for the counterfactual mean for each unit and each time. Under regularity conditions, this bound leads to asymptotically valid confidence intervals for the counterfactual mean as the number of units and time points grows to $\infty$.
translated by 谷歌翻译
我们在无限地平线马尔可夫决策过程中考虑批量(离线)策略学习问题。通过移动健康应用程序的推动,我们专注于学习最大化长期平均奖励的政策。我们为平均奖励提出了一款双重强大估算器,并表明它实现了半导体效率。此外,我们开发了一种优化算法来计算参数化随机策略类中的最佳策略。估计政策的履行是通过政策阶级的最佳平均奖励与估计政策的平均奖励之间的差异来衡量,我们建立了有限样本的遗憾保证。通过模拟研究和促进体育活动的移动健康研究的分析来说明该方法的性能。
translated by 谷歌翻译